IN THE CLAIMS

Please amend the claims as follows: 

1. (Currently Amended) In a computer system having a scalar processing unit and a vector
processing unit, wherein the vector processing unit includes a vector dispatch unit, a method of
[[executing a vector memory instruction having a scalar operand]] decoupling operation of the
scalar processing unit from that of the vector processing unit, the method comprising:

sending a vector instruction from the scalar processing unit to the vector dispatch unit, 
wherein sending includes marking the vector instruction as complete if the vector instruction is 
not a vector memory instruction and if the vector instruction does not require scalar operands;

reading [[the]] a scalar operand, wherein reading includes transferring the scalar operand
from the scalar processing unit to the vector dispatch unit; 

[[determining]] predispatching the vector instruction within the vector dispatch unit if the
vector [[memory]] instruction is scalar committed; [[and]]

dispatching the predispatched vector instruction if all required operands are ready; and 

[[if the vector memory instruction is scalar committed,]] executing the dispatched vector
[[memory operation]] instruction as a function of the scalar operand.

2. (Currently Amended) The method according to claim 1, wherein executing the
dispatched vector [[memory operation]] instruction includes translating an address associated with
the vector [[memory operation]] instruction and trapping on a translation fault.
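
For illustration only, the send, predispatch, and dispatch steps recited in claim 1 can be sketched in software. This is a minimal model under stated assumptions, not the claimed hardware: the class and field names (`VectorInstr`, `VectorDispatchUnit`, `scalar_committed`, and so on) are hypothetical, and in-order handling of the queues is an assumption the claims do not state.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class VectorInstr:
    # All field names are hypothetical, chosen to mirror the claim language.
    name: str
    is_memory: bool = False
    needs_scalar: bool = False
    complete: bool = False           # "marked as complete" on the scalar side
    scalar_committed: bool = False
    operand: Optional[int] = None    # scalar operand, once transferred


class VectorDispatchUnit:
    def __init__(self):
        self.pending = deque()        # sent, not yet predispatched
        self.predispatched = deque()  # predispatched, waiting for operands
        self.dispatched = []          # handed on for execution

    def send(self, instr):
        # Sending marks the instruction complete when it is neither a vector
        # memory instruction nor in need of scalar operands.
        if not instr.is_memory and not instr.needs_scalar:
            instr.complete = True
        self.pending.append(instr)

    def step(self):
        # Predispatch (in order, by assumption) once scalar committed.
        while self.pending and self.pending[0].scalar_committed:
            self.predispatched.append(self.pending.popleft())
        # Dispatch once all required operands are ready.
        while self.predispatched:
            head = self.predispatched[0]
            if head.needs_scalar and head.operand is None:
                break
            self.dispatched.append(self.predispatched.popleft())
```

In this sketch a vector add that needs no scalar operand is marked complete as soon as it is sent, while a vector load stays predispatched until its scalar operand (for example a base address) has been transferred.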

3. (Currently Amended) In a computer system having a scalar processing unit and a vector 
processing unit, wherein the vector processing unit includes a vector dispatch unit, a method of 
decoupling [[vector data loads from vector instruction execution]] operation of the scalar processing
unit from that of the vector processing unit, the method comprising:

sending a vector instruction from the scalar processing unit to the vector dispatch unit, 
wherein sending includes marking the vector instruction as complete if the vector instruction is 
not a vector memory instruction and if the vector instruction does not require scalar operands; 
reading a scalar operand, wherein reading includes transferring the scalar operand from
the scalar processing unit to the vector dispatch unit; 

predispatching the vector instruction within the vector dispatch unit if the vector 
instruction is scalar committed; 

dispatching the predispatched vector instruction if all required operands are ready; 

generating an address for a vector load; 

issuing a vector load request to memory; 

receiving vector data from memory; 

storing the vector data in a load buffer; 

transferring the vector data from the load buffer to a vector register; and 
executing [[a]] the dispatched vector instruction on the vector data stored in the vector 
register. 

4. (Original) The method according to claim 3, wherein the vector processing unit includes 
a vector execute unit and a vector load/store unit, wherein issuing a vector load request to 
memory includes issuing and executing vector memory references in the vector load/store unit 
when the vector load store unit has received the instruction and memory operands from the scalar 
processing unit. 

5. (Currently Amended) In a computer system having a scalar processing unit and a vector 
processing unit, wherein the vector processing unit includes a vector dispatch unit, a method of 
decoupling [[vector data loads from vector instruction execution]] operation of the scalar processing
unit from that of the vector processing unit, the method comprising:

sending a vector instruction from the scalar processing unit to the vector dispatch unit, 
wherein sending includes marking the vector instruction as complete if the vector instruction is 
not a vector memory instruction and if the vector instruction does not require scalar operands; 

reading a scalar operand, wherein reading includes transferring the scalar operand from 
the scalar processing unit to the vector dispatch unit; 

predispatching the vector instruction within the vector dispatch unit if the vector 
instruction is scalar committed; 
dispatching the predispatched vector instruction if all required operands are ready;
generating a first and a second address for a vector load; 
issuing first and second vector load requests to memory; 

receiving vector data associated with the first and second addresses from memory; 
storing vector data associated with the first address in a first vector register; 
storing vector data associated with the second address in a second vector register; 
executing a vector instruction on the vector data stored in the first vector register; 
renaming the second vector register; and 

executing [[a]] the dispatched vector instruction on the vector data stored in the second 
vector register. 

6. (Currently Amended) The method according to claim [[3]] 5, wherein the vector 
processing unit includes a vector execute unit and a vector load/store unit, wherein issuing a 
vector load request to memory includes issuing and executing vector memory references in the 
vector load/store unit when the vector load store unit has received the instruction and memory 
operands from the scalar processing unit. 

7. (Currently Amended) A computer system, comprising: 
a scalar processing unit; and 

a vector processing unit, wherein the vector processing unit includes a vector dispatch 
unit, a vector execute unit and a vector load/store unit; 

wherein the scalar processing unit sends a vector instruction and a scalar operand to the 
vector dispatch unit, wherein sending includes marking the vector instruction as complete if the 
vector instruction is not a vector memory instruction and if the vector instruction does not 
require scalar operands; 

wherein the vector dispatch unit predispatches the vector instruction within the vector 
dispatch unit if the vector instruction is scalar committed and then dispatches the predispatched 
vector instruction to one or more of the vector execute unit and the vector load/store unit if all
required operands are ready;

wherein the vector load/store unit receives an instruction and memory operands from the
scalar processing unit, issues and executes a vector memory load reference as a function of the 
instruction and the memory operands received from the scalar processing unit, and stores data 
received as a result of the vector memory reference in a load buffer; and 

wherein the vector execute unit issues the vector memory load instruction and transfers 
the data received as a result of the vector memory reference from the load buffer to a vector 
register. 

8. (Currently Amended) In a computer system having a scalar processing unit and a vector
processing unit, a method of decoupling scalar and vector execution, comprising: 

dispatching scalar instructions to a scalar instruction queue; 

[[dispatching]] sending a vector instruction that requires scalar operands to the scalar
instruction queue and to a vector instruction queue, wherein sending includes marking the vector
instruction as complete if the vector instruction is not a vector memory instruction and if the
vector instruction does not require scalar operands;

executing the vector instruction in the scalar processing unit, wherein executing the 
vector instruction in the scalar processing unit includes writing a scalar operand to a scalar 
operand queue; 

predispatching the vector instruction sent to the vector instruction queue if the vector 
instruction is scalar committed; 

notifying the vector processing unit that the scalar operand is available in the scalar 
operand queue; and 

dispatching the predispatched vector instruction if all required scalar operands are ready; and

executing the dispatched vector instruction in the vector processing unit, wherein 
executing the vector instruction in the vector processing unit includes reading the scalar operand 
from the scalar operand queue. 
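
For illustration only, the scalar operand queue recited in claim 8 can be modeled as a FIFO with a separate availability notification. This is a minimal sketch, assuming in-order operand delivery; `ScalarOperandQueue`, `scalar_side`, `vector_side`, and the scalar-times-vector operation are hypothetical names and choices that do not come from the claims.

```python
from collections import deque


class ScalarOperandQueue:
    """Hypothetical FIFO between the scalar and vector processing units."""

    def __init__(self):
        self._q = deque()
        self.available = 0  # operands the vector unit has been notified about

    def write(self, value):
        self._q.append(value)

    def notify(self):
        self.available += 1

    def read(self):
        if self.available == 0:
            raise RuntimeError("scalar operand not yet available")
        self.available -= 1
        return self._q.popleft()


def scalar_side(soq, scalar_regs, reg):
    # The scalar unit "executes" its part of the vector instruction: it
    # writes the operand to the queue, notifies the vector unit, and moves on.
    soq.write(scalar_regs[reg])
    soq.notify()


def vector_side(soq, vec):
    # The vector unit reads the operand when it executes the dispatched
    # instruction (here an assumed scalar-times-vector multiply).
    s = soq.read()
    return [s * x for x in vec]
```

Because the operand is queued rather than read directly from a scalar register, the scalar side can run ahead by several vector instructions before the vector side consumes any operand, which is the decoupling the claim describes.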

9. (Currently Amended) In a computer system having a scalar processing unit and a vector
processing unit, a method of decoupling generating an address for a vector memory reference
and a vector execution, comprising:

dispatching scalar instructions to a scalar instruction queue; 

[[dispatching]] sending a vector instruction to the scalar instruction queue and to a vector
instruction queue, wherein sending includes marking the vector instruction as complete if the
vector instruction is not a vector memory instruction and if the vector instruction does not
require scalar operands;

executing the vector instruction in the scalar processing unit, wherein executing the 
vector instruction in the scalar processing unit includes generating an address and writing the 
address to a scalar operand queue; 

predispatching the vector instruction sent to the vector instruction queue if the vector 
instruction is scalar committed; 

notifying the vector processing unit that the address is available in the scalar operand 
queue; and 

dispatching the predispatched vector instruction if all required operands are ready; and 
executing the dispatched vector instruction in the vector processing unit, wherein 
executing the vector instruction in the vector processing unit includes reading the address from 
the scalar operand queue and generating a memory request as a function of the address read from 
the scalar operand queue. 

10. (Currently Amended) In a computer system having a scalar processing unit and a vector
processing unit, a method of executing a vector instruction, comprising: 

dispatching scalar instructions to a scalar instruction queue; 

[[dispatching]] sending a vector instruction to the scalar instruction queue and to a vector
instruction queue, wherein sending includes marking the vector instruction as complete if the
vector instruction is not a vector memory instruction and if the vector instruction does not
require scalar operands;
executing the vector instruction in the scalar processing unit, wherein executing the
vector instruction in the scalar processing unit includes generating an address and writing the 
address to a scalar operand queue; 

predispatching the vector instruction sent to the vector instruction queue if the vector 
instruction is scalar committed; 

notifying the vector processing unit that the address is available in the scalar operand 
queue; and 

dispatching the predispatched vector instruction if all required operands are ready; and 
executing the dispatched vector instruction in the vector processing unit, wherein 
executing the dispatched vector instruction in the vector processing unit includes: 
reading the address from the scalar operand queue; 

generating a memory request as a function of the address read from the scalar operand
queue;

receiving vector data from memory; 
storing the vector data in a load buffer; 

transferring the vector data from the load buffer to a vector register; and 
executing a vector instruction on the vector data stored in the vector register. 

11. (Currently Amended) In a computer system having a scalar processing unit and a vector
processing unit, a method of unrolling a loop, comprising: 

preparing a first and a second vector instruction, wherein each vector instruction executes
an iteration through the loop and wherein each vector instruction requires calculation of a scalar 
loop value; 

[[dispatching]] sending the first and second vector instructions to the scalar instruction queue
and to a vector instruction queue, wherein sending includes marking the vector instruction as
complete if the vector instruction is not a vector memory instruction and if the vector instruction
does not require scalar operands;

executing each vector instruction in the scalar processing unit, wherein executing each 
vector instruction in the scalar processing unit includes writing a scalar operand representing the 
scalar loop value calculated for each vector instruction to a scalar operand queue; 
predispatching each vector instruction sent to the vector instruction queue if the vector
instruction is scalar committed; 

notifying the vector processing unit that the scalar operand is available in the scalar 
operand queue; 

dispatching the predispatched vector instruction if all required operands are ready; and 
executing the first and second dispatched vector instructions in the vector processing unit, 
wherein executing the dispatched vector instruction in the vector processing unit includes 
reading the scalar operands associated with each instruction from the scalar operand queue. 

12. (New) The method according to claim 3, wherein storing the vector data in a load buffer 
and transferring the vector data from the load buffer to a vector register are decoupled from each 
other. 

13. (New) The method according to claim 3, wherein storing the vector data in a load buffer 
includes writing memory load data to the load buffer until all previous memory operations 
complete without fault. 

14. (New) The method according to claim 4, wherein storing the vector data in a load buffer 
and transferring the vector data from the load buffer to a vector register are decoupled from each 
other. 

15. (New) The method according to claim 4, wherein storing the vector data in a load buffer 
includes writing memory load data to the load buffer until all previous memory operations 
complete without fault. 



16. (New) The system according to claim 7, wherein the load buffer stores memory load 
data until it is determined that no previous memory operation will fail and, if no previous 
memory operations have failed, the load buffer transfers the data to the vector register. 
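
For illustration only, the load buffer behavior recited in claims 13, 15 and 16 can be sketched as a holding area that releases data to a vector register only after every earlier memory operation is known to have completed without fault. This is a minimal model under stated assumptions: `LoadBuffer`, the status strings, and the dictionary used as a register file are all hypothetical, and what happens to buffered data after a fault is deliberately left out because the claims do not say.

```python
class LoadBuffer:
    """Hypothetical load buffer keyed by a load identifier."""

    def __init__(self):
        self.entries = {}  # load id -> vector data returned from memory

    def store(self, load_id, data):
        # Memory data is written here first, not into the vector register.
        self.entries[load_id] = list(data)

    def transfer(self, load_id, vector_regs, dest, earlier_status):
        # earlier_status lists the state of each previous memory operation:
        # "done" (completed without fault), "pending", or "fault".
        if any(status != "done" for status in earlier_status):
            return False  # keep holding the data in the buffer
        vector_regs[dest] = self.entries.pop(load_id)
        return True
```

Keeping `store` and `transfer` as separate calls mirrors the decoupling of claims 12 and 14: data can arrive from memory well before the vector register is written.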



